AITopics | training data

training data

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

From Privacy to Generalization: Linear Max-Information Bounds for DP-SGD

Lampert, Christoph H., Zakerinia, Hossein

arXiv.org Machine Learning, May-27-2026

Understanding the relationship between generalization and privacy remains a central challenge in modern machine learning theory, particularly for deep networks trained by variants of differentially private stochastic gradient descent (DP-SGD). In this work we make progress on this persistent open problem by proving a finite-sample bound on the approximate max-information of DP-SGD whose scaling is comparable to the classic result of Dwork et al. (2015) for $ε$-differentially private algorithms, namely at most linear in the dataset size. From our result we obtain a general-purpose PAC-Bayes generalization bound in which the necessary prior distribution can be learned by DP-SGD, as well as a generalization bound for DP-SGD-trained models themselves, with a complexity term that is fully explicit and controlled by the optimization hyperparameters.
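The abstract refers to DP-SGD without describing it. As background only, the commonly described form of a DP-SGD update (clip each per-example gradient, sum, add Gaussian noise, average) might be sketched as below. This is a generic illustration and not the paper's analysis or the authors' code; the function names, the parameter names `clip_norm` and `noise_multiplier`, and the toy gradients are all assumptions made for the example.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Scale one per-example gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One noisy update: clip each example's gradient, sum, add noise, average."""
    batch_size = len(per_example_grads)
    clipped = [clip_gradient(g, clip_norm) for g in per_example_grads]
    summed = [sum(column) for column in zip(*clipped)]
    # Gaussian noise with standard deviation noise_multiplier * clip_norm,
    # added once to the summed gradient (not to each example separately).
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [p - lr * n / batch_size for p, n in zip(params, noisy)]


# Toy usage with two hand-written per-example gradients (hypothetical values).
rng = random.Random(0)
new_params = dp_sgd_step(
    params=[0.0, 0.0],
    per_example_grads=[[3.0, 4.0], [0.0, 0.0]],
    lr=1.0,
    clip_norm=1.0,
    noise_multiplier=1.0,
    rng=rng,
)
```

In this sketch the clipping threshold, noise multiplier, learning rate, and batch size are the optimization hyperparameters; the abstract states that the complexity term of the bound is controlled by such hyperparameters, but the exact dependence is given in the paper, not here.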

artificial intelligence, dp-sgd, machine learning, (15 more...)

arXiv.org Machine Learning

2605.26222

Country:

Europe (0.28)
North America (0.28)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Aerodynamic force reconstruction using physics-informed Gaussian processes

Tondo, Gledson Rodrigo, Kavrakov, Igor, Morgenthal, Guido

arXiv.org Machine Learning, May-22-2026

Accurate modeling of aerodynamic loads is essential for understanding and predicting the responses of complex structural systems. However, these models often rely on simplifications of the true physical forces, introducing assumptions that can limit their accuracy. Validating such models becomes particularly challenging in the presence of noisy or incomplete data. To address this, we introduce a probabilistic physics-informed machine learning approach designed to reconstruct the underlying aerodynamic loads from noisy measurements of structural dynamic responses. The model avoids overfitting, eliminates the need for regularization schemes, and allows for the use of heterogeneous and multi-fidelity data during the training process. The efficacy of the approach is demonstrated through the reconstruction of aerodynamic loads on the Great Belt East Bridge, simulated under a linear unsteady assumption. Results show strong agreement between true and predicted loads, particularly in terms of root mean squared errors, magnitude, phase angle, and peak values of the signals. The load reconstruction method holds broad applicability, such as model validation, future load estimation, and structural damage prognosis.

artificial intelligence, gaussian process, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1007/978-3-032-15130-8_20

2605.22111

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Middle East > Republic of Türkiye (0.14)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

A Rigorous, Tractable Measure of Model Complexity

Allerbo, Oskar, Schön, Thomas B.

arXiv.org Machine Learning, May-21-2026

One of the most fundamental properties of a machine learning model is its complexity, with applications across topics such as interpretation, generalization, and model selection. Despite its importance, there is no canonical, model-agnostic way to assess a model's complexity. While simple heuristics, such as the number or magnitude of parameters, yield very crude estimates, hyperparameter-based approaches, such as polynomial degree or kernel length scale, do not generalize across model classes. More rigorous methods, including the Vapnik-Chervonenkis dimension (VCD) (Vapnik, 2013), Rademacher complexity (RMC) (Bartlett and Mendelson, 2002), and effective number of parameters (or effective degrees of freedom, ENP) (Efron, 1986), are difficult, or even impossible, to compute in practice, leaving the user to resort to crude bounds and/or approximations. The topic is further complicated by the often overlooked distinction between model and function complexity, where the former sets a ceiling on the latter.

artificial intelligence, complexity, machine learning, (18 more...)

arXiv.org Machine Learning

2605.21167

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Training data attribution in diffusion models via mirrored unlearning and noise-consistent skew

Serrà, Joan, Goswami, Dipam, Morreale, Fabio, Liao, Wei-Hsiang, Mitsufuji, Yuki

arXiv.org Machine Learning, May-19-2026

Training data attribution (TDA) should enable generative model interpretability and foster a variety of related downstream tasks. Nonetheless, current TDA approaches lack reliability and robustness, preventing their adoption in real-world setups. In this paper, we take a decisive step towards more reliable and robust TDA for diffusion models. We propose to perform TDA with mirrored unlearning and noise-consistent skew (MUCS). The idea is to fine-tune a second model with bounded mirrored gradient ascent, and to measure the normalized skew of this model with respect to the original one using consistent noise samples. We show that, while being conceptually simple and generic, MUCS systematically outperforms existing methods on three different datasets by a large margin. We additionally study the effect that core design choices have on final performance, and analyze novel aspects regarding the overlap of influential instances across generated items and the potential of ensembling TDA approaches. We believe that our findings may have broader implications for more general unlearning setups, as well as for tasks requiring the comparison of diffusion losses.

artificial intelligence, diffusion model, machine learning, (14 more...)

arXiv.org Machine Learning

2605.17938

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.46)

Industry: Information Technology > Security & Privacy (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

A numerical study into neural network surrogate model performance for uncertainty propagation

Wade, Noah, Teferra, Kirubel

arXiv.org Machine Learning, May-18-2026

Neural network surrogate models have emerged as a promising approach to model solution fields for a wide variety of boundary value problems encountered in physical modeling. Stochastic problems represent an area of particularly high interest because of the potential to significantly reduce the repeated evaluation of expensive forward models via traditional numerical solvers when conducting parametric analysis. However, many studies found in the literature primarily focus on the ability of neural network surrogate models to represent deterministic samples or mean field solutions and largely overlook surrogate model performance at the tails of the distribution. The present study examines in detail the ability of neural network surrogate models to capture the full distribution of solution fields over the entire probability space, with emphasis placed on the tails of the distribution. Serving as a canonical problem is the heat conduction equation with a highly stochastic source term, inducing extremely large variation in the thermal solution field. Comparisons are made between a classic feed-forward fully connected network and a Deep Operator Network architecture, using both data-driven and physics-informed loss functions. Results show that the worst-case prediction errors are an order of magnitude larger than the mean field error, highlighting the importance of the outlier samples. The large errors associated with extreme samples result from the networks having to extrapolate beyond the bounds of the training data. A method for identifying these samples is presented along with a discussion of potential approaches to account for their errors. Among the models considered, the fully connected neural network trained using a weak form residual loss performs best in handling these extrapolated inputs, achieving the highest prediction accuracy for the numerically produced datasets.

artificial intelligence, machine learning, neural network, (15 more...)

arXiv.org Machine Learning

doi: 10.1061/JENMDT/EMENG-8978

2605.16078

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.69)

Industry: Government > Military > Navy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

AI chatbots are giving out people's real phone numbers

MIT Technology Review, May-13-2026, 18:09:03 GMT

People report that their personal contact info was surfaced by Google AI, and there's apparently no easy way to prevent it. A Redditor recently wrote that he was "desperate for help": for about a month, he said, his phone had been inundated by calls from "strangers" who were "looking for a lawyer, a product designer, a locksmith." Callers were apparently misdirected by Google's generative AI. In March, a software developer in Israel was contacted on WhatsApp after Google's chatbot Gemini provided incorrect customer service instructions that included his number. And in April, a PhD candidate at the University of Washington was messing around on Gemini and got it to cough up her colleague's personal cell phone number. AI researchers and online privacy experts have long warned of the myriad dangers generative AI poses for personal privacy.

information, machine learning, natural language, (18 more...)

MIT Technology Review

Country:

Asia > Middle East > Israel (0.25)
North America > United States (0.15)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.59)

Extrapolation in Statistical Learning with Extreme Value Theory

Engelke, Sebastian, Gnecco, Nicola, Sabourin, Anne

arXiv.org Machine Learning, May-5-2026

Extreme value theory provides rigorous theory and statistical tools for extrapolation in machine learning, particularly in settings where traditional methods struggle due to data scarcity in the tails. A broad range of tasks benefit from these advances, including regression and classification beyond the training data, extreme quantile regression, supervised and unsupervised dimension reduction, generative artificial intelligence and anomaly detection. This review synthesizes recent developments in these fields at the intersection of statistical learning and extreme value theory, with a focus on principled methods based on asymptotically motivated representations of the tail of univariate and multivariate distributions. We consider different theoretical frameworks for both asymptotically dependent and independent data and discuss how they translate into efficient statistical methods for extrapolation to extreme regions. By addressing both theoretical and practical aspects, we offer a comprehensive overview of the state-of-the-art in this quickly evolving field, and identify promising directions for future research.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Machine Learning

2605.01909

Country: Europe (1.00)

Genre: Overview (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

[Figure residue: Training step, Preprocessing, f(x, v)]

Neural Information Processing Systems, May-1-2026, 02:04:01 GMT

In the following sections, we provide additional details about the network architecture, training, and experiments. The source code and WBC-SPH data set are published at https://github.com/

A.1 Implementation Details. We implement our neural network with TensorFlow (https://www.tensorflow.org), They also serve as the basis for the implementation of our antisymmetric CConv (ASCC) layer.

Axis for Mirroring. As mentioned in the main text, the mirror axis for ASCC layers can be chosen freely while fulfilling the requirements from theory. This provides a degree of freedom for implementation. We decided to use a fixed axis, which in our case corresponds to the spatial y-axis. While the mirroring could potentially be coupled to the spatial content of features, we found that a single, fixed axis for mirroring simplifies the implementation of the ASCCs, and hence is preferable in practice.

Additional Modifications. In addition to the properties of our algorithm as discussed in Section 2.3 and the ablation study in Section 3, we normalize the input data depending on the given gravitational direction in the model.

artificial intelligence, machine learning, particle, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Ambient Diffusion: Learning Clean Distributions from Corrupted Data

Neural Information Processing Systems, May-1-2026, 01:31:36 GMT

We present the first diffusion-based framework that can learn an unknown distribution using only highly corrupted samples. This problem arises in scientific applications where access to uncorrupted samples is impossible or expensive to acquire. Another benefit of our approach is the ability to train generative models that are less likely to memorize any individual training sample, since they never observe clean training data. Our main idea is to introduce additional measurement distortion during the diffusion process and require the model to predict the original corrupted image from the further corrupted image. We prove that our method leads to models that learn the conditional expectation of the full uncorrupted image given this additional measurement corruption. This holds for any corruption process that satisfies some technical conditions (and in particular includes inpainting and compressed sensing). We train models on standard benchmarks (CelebA, CIFAR-10 and AFHQ) and show that we can learn the distribution even when all the training samples have 90% of their pixels missing. We also show that we can finetune foundation models on small corrupted datasets (e.g. MRI scans with block corruptions) and learn the clean distribution without memorizing the training set.
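The "further corruption" idea in this abstract can be illustrated for the inpainting case with a toy sketch. It assumes a flat list of pixel values, a random keep/drop mask as the original corruption, and a second random drop as the additional corruption; the diffusion noising step and the network itself are left out. All function names and the parameters `keep_prob` and `extra_drop_prob` are hypothetical and are not taken from the authors' implementation.

```python
import random


def random_mask(n, keep_prob, rng):
    """Original corruption: 1 marks an observed pixel, 0 a missing one."""
    return [1 if rng.random() < keep_prob else 0 for _ in range(n)]


def further_corrupt(mask, extra_drop_prob, rng):
    """Additional corruption: hide some of the pixels that were still observed."""
    return [m if rng.random() >= extra_drop_prob else 0 for m in mask]


def apply_mask(image, mask):
    """Zero out every pixel whose mask entry is 0."""
    return [p * m for p, m in zip(image, mask)]


def masked_mse(prediction, corrupted_image, mask):
    """Squared error averaged over the pixels observed under the ORIGINAL mask.

    Pixels hidden only by the further corruption still count, so a model that
    merely copies its input cannot reach zero loss on them.
    """
    observed = sum(mask)
    if observed == 0:
        return 0.0
    total = sum(m * (p - y) ** 2 for p, y, m in zip(prediction, corrupted_image, mask))
    return total / observed


# Toy usage: the clean image is never shown to the (hypothetical) model.
rng = random.Random(0)
clean = [rng.random() for _ in range(100)]
mask = random_mask(len(clean), keep_prob=0.1, rng=rng)      # 90% of pixels missing
training_sample = apply_mask(clean, mask)                   # all that is available
further_mask = further_corrupt(mask, extra_drop_prob=0.3, rng=rng)
model_input = apply_mask(training_sample, further_mask)     # what a model would see
loss_if_copying_input = masked_mse(model_input, training_sample, mask)
```

The point of the sketch is the asymmetry between the two masks: the input is built with `further_mask`, while the loss is evaluated with `mask`, so the model has to predict pixels it cannot see in its input.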

artificial intelligence, diffusion model, machine learning, (18 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Industry: